Video Semantic Segmentation


Video semantic segmentation is the process of segmenting objects in videos into different classes or categories.

Diffusion-aided Extreme Video Compression with Lightweight Semantics Guidance

Add code
Feb 05, 2026
Viaarxiv icon

E.M.Ground: A Temporal Grounding Vid-LLM with Holistic Event Perception and Matching

Add code
Feb 05, 2026
Viaarxiv icon

UniSurg: A Video-Native Foundation Model for Universal Understanding of Surgical Videos

Add code
Feb 05, 2026
Viaarxiv icon

SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM

Add code
Feb 03, 2026
Viaarxiv icon

Audit After Segmentation: Reference-Free Mask Quality Assessment for Language-Referred Audio-Visual Segmentation

Add code
Feb 03, 2026
Viaarxiv icon

Multi-Objective Optimization for Synthetic-to-Real Style Transfer

Add code
Feb 03, 2026
Viaarxiv icon

MLV-Edit: Towards Consistent and Highly Efficient Editing for Minute-Level Videos

Add code
Feb 02, 2026
Viaarxiv icon

Sem-NaVAE: Semantically-Guided Outdoor Mapless Navigation via Generative Trajectory Priors

Add code
Feb 01, 2026
Viaarxiv icon

InspecSafe-V1: A Multimodal Benchmark for Safety Assessment in Industrial Inspection Scenarios

Add code
Jan 29, 2026
Viaarxiv icon

LEMON: How Well Do MLLMs Perform Temporal Multimodal Understanding on Instructional Videos?

Add code
Jan 27, 2026
Viaarxiv icon